Query Language for Access to Speech Corpora

Authors

Andreas Mengel

Ulrich Heid

Abstract

With more and more speech corpora at hand, the unit selection technique is a promising approach in concatenative speech synthesis. What is missing are models of optimal parameters that sufficiently describe utterances to be produced and their corresponding counterparts in collections of speech data. Before such models can be built, existing corpora have to be annotated on possibly relevant linguistic and signal levels. This paper deals with standards developed in the MATE project for the uniform annotation of speech corpora, represented in XML, and with a query language that can access these corpora. These standards may accelerate the identification of optimal elements for the annotation and description of parameters relevant for the unit selection technique.


Similar resources

Query Language for Research in Phonetics

With the growing availability of spoken language corpora more and more data driven research in phonetics is possible. The downside of having huge speech corpora is that they have to be segmented and labeled, before they can be exploited. As labeling and annotation are time-consuming and costly, there is an interest in standardization which would support the exchange and reuse of labeled data. T...


Selecting the Most Appropriate Query Language for Using Outer Joins to Extract Data in Datalog Mode in the DES Deductive Database System

Deductive database systems are designed around a logical data model. Unlike a relational database management system (RDBMS), which stores data in tables, a deductive database system stores data as facts. The Datalog Educational System (DES) is a deductive database system whose default mode is Datalog. It can extract data to use outer joins with three query la...


Large Linguistically-Processed Web Corpora for Multiple Languages

The Web contains vast amounts of linguistic data. One key issue for linguists and language technologists is how to access it. Commercial search engines give highly compromised access. An alternative is to crawl the Web ourselves, which also allows us to remove duplicates and near-duplicates, navigational material, and a range of other kinds of non-linguistic matter. We can also tokenize, lemmati...


graphANNIS: A Fast Query Engine for Deeply Annotated Linguistic Corpora

We present graphANNIS, a fast implementation of the established query language AQL for dealing with deeply annotated linguistic corpora. AQL builds on a graph-based abstraction for modeling and exchanging linguistic data, yet all its current implementations use relational databases as storage layer. In contrast, graphANNIS directly implements the ANNIS graph data model in main memory. We show th...


XSLT as a Linguistic Query Language

Introduction: As the number of natural language applications being developed increases, so does the need for a good linguistic database management system. A linguistic database, more commonly known as a corpus, is a collection of linguistic data, either of written text or as a transcription of recorded speech. They are designed to be a balanced collection of data that represent some aspect of a ...



Publication date: 1999
